The Journal of China Universities of Posts and Telecommunications (English edition) ›› 2013, Vol. 20 ›› Issue (6): 77-87. doi: 10.1016/S1005-8885(13)60112-0
LIN Wen-hui1, LEI Zhen-ming2, LIU Jun1, YANG Jie3, LIU Fang3, HE Gang1
Abstract: We present an approach to optimizing the MapReduce architecture that makes heterogeneous cloud environments more stable and efficient. Unlike previous methods, our approach introduces machine learning into the MapReduce framework and dynamically adjusts the MapReduce algorithm according to the statistics that the learning produces. The approach has three main aspects: learning the performance of each machine, a reduce task assignment algorithm based on the learning result, and an optimized speculative execution mechanism. It also has two important features. First, the MapReduce framework obtains the performance values of the nodes in the cluster through a machine learning module, and this module recalibrates the nodes' performance values daily so that the assessment of cluster performance stays accurate. Second, by optimizing the task assignment algorithm, we can maximize the performance of heterogeneous clusters. According to our evaluation results, cluster performance can improve by 19% in the current heterogeneous cloud environment, and cluster stability is greatly enhanced.
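The abstract does not state the assignment formula, so the following is only an illustrative sketch of what a performance-aware reduce task assignment might look like. It assumes each node already has a positive learned performance score and splits the reduce tasks in proportion to those scores, using a largest-remainder rule to keep the counts integral. The function name assign_reduce_tasks, the score dictionary, and the proportional rule itself are assumptions made for illustration, not the authors' algorithm.

```python
from typing import Dict


def assign_reduce_tasks(perf: Dict[str, float], num_tasks: int) -> Dict[str, int]:
    """Split num_tasks across nodes in proportion to their performance scores.

    Hypothetical helper: `perf` maps a node name to a positive score that a
    learning module is assumed to have produced. The proportional rule is an
    assumption, not the method described in the paper.
    """
    if num_tasks < 0:
        raise ValueError("num_tasks must be non-negative")
    if not perf or any(score <= 0 for score in perf.values()):
        raise ValueError("every node needs a positive performance score")

    total = sum(perf.values())
    # Ideal fractional share of tasks for each node.
    ideal = {node: num_tasks * score / total for node, score in perf.items()}
    # Start from the integer part of each share.
    assigned = {node: int(share) for node, share in ideal.items()}
    leftover = num_tasks - sum(assigned.values())
    # Hand out the remaining tasks to the nodes with the largest fractional
    # remainders; ties are broken by node name so the result is deterministic.
    by_remainder = sorted(perf, key=lambda n: (-(ideal[n] - assigned[n]), n))
    for node in by_remainder[:leftover]:
        assigned[node] += 1
    return assigned
```

For example, with scores of 3.0, 2.0, and 1.0 for three nodes and 12 reduce tasks, this sketch gives the nodes 6, 4, and 2 tasks respectively, so the faster machines receive proportionally more work.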